Preprocessing for the Automated Transcription of Polyphonic Music: Linking Wavelet Theory and Auditory Filtering

Authors

  • Rolf Wöhrmann
  • Ludger Solbach
Abstract

In this paper a fast method for the calculation of a linear time-frequency distribution based on the gammatone filter auditory model is introduced as a preprocessing step for the automated transcription of music and auditory source separation. Examples show that this method is promising for the analysis of music pieces with limited spectral overlap between the different signal components.

In the past few years wavelet transforms have become an important tool for signal processing; see Rioul and Vetterli (1991) for an overview. An important property of both the continuous wavelet transform (CWT) and the short-time Fourier transform (STFT) is their linearity, which makes them more suitable for the analysis of multicomponent signals than quadratic time-frequency distributions (TFDs), which suffer from cross-term artefacts. The CWT of a signal $s(t)$ is given by $W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} s(t)\, g^{*}\!\left(\frac{t-b}{a}\right) dt, \quad a > 0$. (1) For a time function $g(t)$ to be admissible as a mother wavelet, it must satisfy $\int_{-\infty}^{\infty} |g(t)|^{2}\, dt < \infty$ and $G(0) = 0$, (2) where $G(f)$ is the Fourier transform of $g(t)$. Functions satisfying these conditions look like short waves, which is why they are called wavelets.

The main difference between the STFT and the CWT is that the STFT analysis window remains unaltered, whereas the CWT window is rescaled by the scaling factor $a$. The quasi-logarithmic organization of musical scales and of the frequency resolution in the human cochlea makes the CWT a more appropriate TFD for acoustic signals than the STFT. Since eq. (1) cannot be evaluated everywhere, it has to be modified by picking certain fixed values for $a$ and $b$, yielding a discrete approximation of the CWT. Furthermore, the wavelets used throughout our work are close to being analytic (progressive), that is, they satisfy $G(f) \approx 0$ for all $f < 0$.
Thus, given the complex-valued filter outputs, the instantaneous frequencies can be estimated from the phases and the signal envelopes from the moduli, provided the signal components have negligible overlap. Due to the high computational burden of the quasi-continuous CWT, infinite impulse response (IIR) realizations like our gammatone approach are desirable. As stated by Patterson (1992), the gammatone filter can be a good approximation of the filtering in the human cochlea if the parameters are properly adjusted. See fig. 1 for an example of a gammatone impulse response. A quasi-analytic version of the gammatone filter is of the …
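
The estimation of envelopes from the moduli and instantaneous frequencies from the phases of complex-valued filter outputs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the excerpt breaks off before the quasi-analytic gammatone filter is defined, so the sketch assumes a cascade of identical complex one-pole IIR sections (whose impulse response is gammatone-like), the ERB formula of Glasberg and Moore, and the bandwidth factor 1.019 commonly used with gammatone filters. All function names are hypothetical.

```python
import cmath
import math


def erb_hz(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore fit, assumed)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def complex_gammatone_iir(x, fs, fc, order=4):
    """Filter the real sequence x with `order` identical complex one-pole sections.

    Each section computes y[k] = x[k] + pole * y[k-1]. The cascade has an impulse
    response of the form (polynomial of degree order-1 in k) * pole**k, which is
    an assumed gammatone-like stand-in, not the filter defined in the paper.
    """
    b = 1.019 * erb_hz(fc)                        # bandwidth parameter in Hz
    r = math.exp(-2.0 * math.pi * b / fs)         # pole radius
    pole = r * cmath.exp(2j * math.pi * fc / fs)  # pole at the centre frequency
    # (1 - r)**order normalises the cascade to unity gain at fc.
    y = [(1.0 - r) ** order * v for v in x]
    for _ in range(order):
        state = 0j
        stage_out = []
        for v in y:
            state = v + pole * state
            stage_out.append(state)
        y = stage_out
    return y


def envelope_and_frequency(y, fs):
    """Return (envelope, instantaneous frequency in Hz) of a complex output y."""
    # Factor 2: for a real input only the positive-frequency half passes the filter.
    envelope = [2.0 * abs(v) for v in y]
    # Phase increment between consecutive samples, converted to Hz.
    freq = [cmath.phase(cur * prev.conjugate()) * fs / (2.0 * math.pi)
            for prev, cur in zip(y, y[1:])]
    return envelope, freq


if __name__ == "__main__":
    fs = 8000.0
    tone = [math.cos(2.0 * math.pi * 440.0 * k / fs) for k in range(1600)]
    out = complex_gammatone_iir(tone, fs, fc=440.0)
    env, freq = envelope_and_frequency(out, fs)
    print("steady-state envelope:", env[-1])
    print("steady-state frequency (Hz):", freq[-1])
```

The estimates are only meaningful when a single component dominates the filter's passband, which corresponds to the negligible-overlap condition stated above. A full filterbank would repeat this for a set of centre frequencies, for example one per semitone.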


Similar resources

Analysis of Musical Audio for Polyphonic Transcription 1st Year Report

This report centres around some of the issues involved in automatic transcription of polyphonic musical audio signals. That is, representing the information contained in the audio in such a way as to be recognisable and usable by a musician. First, a review of the various fields which have a bearing on the subject is put forward, including music, music psychology, auditory psychology and signa...


An Analysis of Achievement of the Philosophical Sense of “Extension” in Music, with Interpretation of Ibn-e Sina’s Explanation an Extension

This research can be considered as one of the studies that seek to explore, in an argumentative way, subtle and solid philosophical concepts in the field of art. The paper provides an analysis of the concept of “extension” in music as one of the most thought-provoking philosophical concepts. The analysis is carried out by interpreting Ibn-Sina’s special conception of musical extension to answer...


Automatic Polyphonic Piano Music Transcription by a Multi-classification Discriminative-Learning

In this paper we investigate the use of locally recurrent neural networks (LRNN), trained by a discriminative learning approach, for automatic polyphonic piano music transcription. Due to the polyphonic character of the input signal, standard discriminative learning (DL) is not adequate and a suitable modification, called multi-classification discriminative learning (MCDL), is introduced. The a...


The complex-valued continuous wavelet transform as a preprocessor for auditory

In this paper we draw links between the widely used gammatone filter auditory model and wavelet theory. From the viewpoint of wavelet theory the benefit from linking these research fields is a fast method for the computation of a timescale representation. From the viewpoint of auditory filtering the benefits are the existence of methods for the detection of signal singularities and for resynthesis. Our...


Automatic Transcription of Pitch Content in Music and Selected Applications

Transcription of music refers to the analysis of a music signal in order to produce a parametric representation of the sounding notes in the signal. This is conventionally carried out by listening to a piece of music and writing down the symbols of common musical notation to represent the occurring notes in the piece. Automatic transcription of music refers to the extraction of such representat...




Publication date: 1995